Alleviating "Posterior Collapse" in Deep Topic Models via Policy Gradient
Deep topic models have proven to be a promising way to extract hierarchical latent representations from documents represented as high-dimensional bag-of-words vectors. However, the representational capability of existing deep topic models is still limited by "posterior collapse", a phenomenon widely criticized in deep generative models, in which the higher-level latent representations exhibit similar or meaningless patterns. To address this, we first develop a novel deep-coupling generative process for existing deep topic models, which incorporates skip connections into the generation of documents, enforcing strong links between a document and its multi-layer latent representations. After that, utilizing data augmentation techniques, we reformulate the deep-coupling generative process as a Markov decision process and develop a corresponding Policy Gradient (PG) based training algorithm, which further alleviates the information reduction at higher layers. Extensive experiments demonstrate that the developed methods effectively alleviate "posterior collapse" in deep topic models, yielding higher-quality latent document representations.
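The skip-connection idea in the abstract can be sketched numerically. The sizes, uniform mixing weights, and random initializations below are illustrative assumptions, not the paper's implementation; the point is only that with skip connections every layer's latent representation contributes directly to the word distribution, so higher layers cannot collapse into document-independent patterns without hurting the likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: vocabulary V, topic counts per layer (top to bottom).
V, layer_sizes = 1000, [16, 64, 256]
L = len(layer_sizes)

# Per-layer topic matrices Phi_l whose columns are distributions over words;
# random Dirichlet init for illustration only.
Phis = [rng.dirichlet(np.ones(V), size=k).T for k in layer_sizes]  # V x k_l

# Hierarchical latent representations theta_l for one document.
thetas = [rng.dirichlet(np.ones(k)) for k in layer_sizes]

# Standard deep topic model: words depend only on the bottom-layer theta.
p_bottom = Phis[-1] @ thetas[-1]

# Deep-coupling sketch: skip connections let EVERY layer's representation
# feed the word distribution, forcing higher layers to stay informative.
weights = np.ones(L) / L  # assumed uniform mixing weights
p_coupled = sum(w * (Phi @ th) for w, Phi, th in zip(weights, Phis, thetas))

assert np.isclose(p_bottom.sum(), 1.0) and np.isclose(p_coupled.sum(), 1.0)
```

Because each `Phi @ th` is already a distribution over the vocabulary, any convex combination of the layers remains a valid word distribution.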
Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning
Guo, Dandan, Lu, Ruiying, Chen, Bo, Zeng, Zequn, Zhou, Mingyuan
Describing visual content in a natural-language utterance is an emerging interdisciplinary problem at the intersection of computer vision (CV) and natural language processing (NLP) (1). As a sentence-level short image caption (2, 3, 4) has limited descriptive capacity, (5) introduce a paragraph-level captioning method that aims to generate a detailed and coherent paragraph describing an image in a finer manner. Recent advances in image paragraph generation focus on building different types of hierarchical recurrent neural networks (HRNNs), e.g., based on LSTMs (6), to generate visual paragraphs. In an HRNN, the high-level RNN recursively produces a sequence of sentence-level topic vectors given the image features as input, while the low-level RNN subsequently decodes each topic vector into an output sentence. By modeling each sentence and coupling the sentences into one paragraph, these hierarchical architectures often outperform flat models (5). To improve performance and generate more diverse paragraphs, advanced methods extending the HRNN with generative adversarial networks (GANs) (7) or variational auto-encoders (VAEs) (8) have been proposed by (9) and (10).
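The two-level HRNN described above can be sketched as follows. All dimensions, parameter names, and the untrained random weights are assumptions for illustration; the low-level decoder uses a deliberately toy recurrence rather than any published architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
D_img, D_topic, D_word, V_cap = 512, 128, 128, 5000  # assumed dimensions

# Illustrative random (untrained) parameters for both RNN levels.
W_hh = rng.standard_normal((D_topic, D_topic)) * 0.01  # high-level recurrence
W_xh = rng.standard_normal((D_topic, D_img)) * 0.01    # image -> topic state
W_td = rng.standard_normal((D_word, D_topic)) * 0.01   # topic -> decoder state
W_dec = rng.standard_normal((V_cap, D_word)) * 0.01    # decoder state -> vocab

img_feat = rng.standard_normal(D_img)  # pooled CNN feature (assumed given)

def sentence_topics(img_feat, n_sentences):
    """High-level RNN: recursively emit one topic vector per sentence."""
    h, topics = np.zeros(D_topic), []
    for _ in range(n_sentences):
        h = np.tanh(W_hh @ h + W_xh @ img_feat)
        topics.append(h)
    return topics

def decode_sentence(topic, max_len=8):
    """Low-level RNN: decode a topic vector into a word-id sequence (greedy)."""
    s, words = np.tanh(W_td @ topic), []
    for _ in range(max_len):
        words.append(int(np.argmax(W_dec @ s)))
        s = np.tanh(s)  # toy recurrence; a real decoder feeds the word back
    return words

# One paragraph = one decoded sentence per topic vector.
paragraph = [decode_sentence(t) for t in sentence_topics(img_feat, 3)]
```

The structural point is the coupling: the paragraph's coherence comes from the high-level recurrence over topic vectors, while each sentence is generated independently from its own topic vector.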
Deep topic modeling by multilayer bootstrap network and lasso
Topic modeling is originally formulated as a hierarchical generative model: a document is generated from a mixture of topics, and each word in the document is generated by first choosing a topic from a document-specific distribution and then choosing the word from the topic-specific distribution. The main difficulty of topic modeling is the optimization problem, which is NP-hard in the worst case due to the intractability of posterior inference. Existing methods aim to find approximate solutions to this difficult optimization problem, which falls into the framework of matrix factorization. Matrix-factorization-based topic modeling maps documents into a low-dimensional semantic space by decomposing the documents into a weighted combination of a set of topic distributions: D ≈ CW, where D(:,d) represents the d-th document, a column vector over a set of words with a vocabulary size of v; C(:,g) denotes the g-th topic, a probability mass function over the vocabulary; and W(g,d) denotes the probability of the g-th topic in the d-th document.
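The factorization D ≈ CW can be illustrated with a minimal sketch using nonnegative matrix factorization via multiplicative updates (the Frobenius-norm variant); the toy data, sizes, and iteration count are assumptions, and this is one generic way to approximate the factorization rather than the method proposed here:

```python
import numpy as np

rng = np.random.default_rng(2)
V, n_docs, G = 50, 20, 5  # vocabulary size, documents, topics (assumed)

# Toy document-word matrix D: column d is the bag-of-words count vector
# of the d-th document.
D = rng.poisson(1.0, size=(V, n_docs)).astype(float)

# Factorize D ≈ C W with multiplicative NMF updates:
# C (V x G) holds topic-word weights, W (G x n_docs) topic-document weights.
C = rng.random((V, G)) + 0.1
W = rng.random((G, n_docs)) + 0.1
for _ in range(200):
    W *= (C.T @ D) / (C.T @ C @ W + 1e-9)
    C *= (D @ W.T) / (C @ W @ W.T + 1e-9)

# Rescale so each column C(:,g) is a probability mass function over the
# vocabulary, moving the scale into W to keep the product C @ W unchanged.
scale = C.sum(axis=0)
C /= scale
W *= scale[:, None]
```

After rescaling, column C(:,g) plays the role of the g-th topic distribution and W(g,d) the (unnormalized) weight of topic g in document d, matching the notation above.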